{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Read a DataFrame From an Excel File" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas has a read_excel function which can be used to create a DataFrame from an Excel file. " ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
StockPriceDate
0MSFT1002019-01-01
1MSFT1102019-02-01
2MSFT1052019-03-01
3MSFT1122019-04-01
4AAPL5002019-01-01
5AAPL5122019-02-01
6AAPL4822019-03-01
7AAPL5252019-04-01
\n", "
" ], "text/plain": [ " Stock Price Date\n", "0 MSFT 100 2019-01-01\n", "1 MSFT 110 2019-02-01\n", "2 MSFT 105 2019-03-01\n", "3 MSFT 112 2019-04-01\n", "4 AAPL 500 2019-01-01\n", "5 AAPL 512 2019-02-01\n", "6 AAPL 482 2019-03-01\n", "7 AAPL 525 2019-04-01" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_excel('Stock Data.xlsx')\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By default, it will load the first sheet. However, we can pass a name of a sheet to get that sheet instead." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
StockPriceDate
0AMZN302019-01-01
1AMZN352019-02-01
2AMZN282019-03-01
3AMZN322019-04-01
4FB9002019-01-01
5FB9102019-02-01
6FB9202019-03-01
7FB9302019-04-01
\n", "
" ], "text/plain": [ " Stock Price Date\n", "0 AMZN 30 2019-01-01\n", "1 AMZN 35 2019-02-01\n", "2 AMZN 28 2019-03-01\n", "3 AMZN 32 2019-04-01\n", "4 FB 900 2019-01-01\n", "5 FB 910 2019-02-01\n", "6 FB 920 2019-03-01\n", "7 FB 930 2019-04-01" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_excel('Stock Data.xlsx', sheet_name='The Second Sheet')\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Write a DataFrame to an Excel File" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## WARNING\n", "\n", "Do not output to an existing workbook, as the original workbook will be overwritten. There is no undo!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas provides `DataFrame.to_excel` to create Excel workbooks from DataFrames. But first, let's modify the data to ensure we're outputting the current `DataFrame`. We'll go ahead and add a column." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
StockPriceDateCustom Column
0AMZN302019-01-01stuff
1AMZN352019-02-01stuff
2AMZN282019-03-01stuff
3AMZN322019-04-01stuff
4FB9002019-01-01stuff
5FB9102019-02-01stuff
6FB9202019-03-01stuff
7FB9302019-04-01stuff
\n", "
" ], "text/plain": [ " Stock Price Date Custom Column\n", "0 AMZN 30 2019-01-01 stuff\n", "1 AMZN 35 2019-02-01 stuff\n", "2 AMZN 28 2019-03-01 stuff\n", "3 AMZN 32 2019-04-01 stuff\n", "4 FB 900 2019-01-01 stuff\n", "5 FB 910 2019-02-01 stuff\n", "6 FB 920 2019-03-01 stuff\n", "7 FB 930 2019-04-01 stuff" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['Custom Column'] = 'stuff'\n", "df" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's write this to a workbook. You just need to specify what to call the new workbook. You must end it with `.xlsx` for it to be a valid workbook." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "df.to_excel('New Book.xlsx')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After running the contents to here, you should now see `New Book.xlsx` in the same folder as this notebook. Open it and verify that the contents match the `DataFrame`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You'll notice that the created workbook has the 0, 1, 2... index alongside it, which we don't want. Just pass `index=False` to get rid of it. Make sure you close the workbook before trying to write to it again, or you'll get an error." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "df.to_excel('New Book.xlsx', index=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now the data looks like we want. The only thing is it's currently outputting to a sheet named \"Sheet1\", but we may want a better name. You can provide it here." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "df.to_excel('New Book.xlsx', index=False, sheet_name='Stock Data')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now it has a proper sheet name and the data displays properly." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Advanced" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The following will not be covered in class, but may be useful for you." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Writing to Multiple Sheets" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "While in general, writing to a workbook replaces it, there is a way to write multiple sheets to a workbook, through the `ExcelWriter`. But first let's create a second `DataFrame` to write." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
StockPriceDateCustom Column
0AMZN302019-01-01other stuff
1AMZN352019-02-01other stuff
2AMZN282019-03-01other stuff
3AMZN322019-04-01other stuff
4FB9002019-01-01other stuff
5FB9102019-02-01other stuff
6FB9202019-03-01other stuff
7FB9302019-04-01other stuff
\n", "
" ], "text/plain": [ " Stock Price Date Custom Column\n", "0 AMZN 30 2019-01-01 other stuff\n", "1 AMZN 35 2019-02-01 other stuff\n", "2 AMZN 28 2019-03-01 other stuff\n", "3 AMZN 32 2019-04-01 other stuff\n", "4 FB 900 2019-01-01 other stuff\n", "5 FB 910 2019-02-01 other stuff\n", "6 FB 920 2019-03-01 other stuff\n", "7 FB 930 2019-04-01 other stuff" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2 = df.copy()\n", "df2['Custom Column'] = 'other stuff'\n", "df2" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have two `DataFrame`s, let's write them to the same workbook. Here we must use the `with` syntax and instead of passing the file path to each `to_excel` command, we will pass the writer." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "with pd.ExcelWriter('New Book.xlsx') as writer:\n", " df.to_excel(writer, sheet_name='First Df', index=False)\n", " df2.to_excel(writer, sheet_name='Second Df', index=False)\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you should see both `DataFrame`s as sheets in the workbook." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.3" } }, "nbformat": 4, "nbformat_minor": 4 }